Hello everyone and welcome back to the computer vision lecture series.
This is lecture 8, part 2.
We are talking about dense motion estimation, and I am going to continue from the previous part of the lecture, where we were discussing error metrics. To recap, the last error metric we used was cross correlation, that is, optimizing cross correlation as an error metric. The advantage of cross correlation is that it is easy to compute: as you can see, you just multiply the two images, one over the other, and sum the products to obtain the cross correlation between them. As discussed, it is called cross correlation because it is a correlation between two images where one is a shifted version of the other, and therefore it is not an auto correlation.
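As a small numeric illustration (my own sketch, not code from the lecture), the raw cross-correlation score between two equally sized grayscale patches is just the sum of their element-wise products; the image, the patch positions, and the two-pixel shift below are made-up example values.

```python
import numpy as np

def cross_correlation(patch0, patch1):
    """Raw cross-correlation: sum of element-wise products of the two patches."""
    return float(np.sum(patch0.astype(np.float64) * patch1.astype(np.float64)))

# Example values (made up): compare a patch with a shifted copy of itself.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
p0 = img[10:42, 10:42]   # reference patch
p1 = img[12:44, 10:42]   # the "other image": the same region shifted by 2 rows
print(cross_correlation(p0, p0), cross_correlation(p0, p1))
```

For the identical patches the sum of products is higher than for the shifted pair, but the raw score also grows with overall brightness, which is exactly what motivates the normalization discussed next.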
The reason for normalizing the cross correlation is simple: if the two images are taken at different times during the day, then the exposure of the camera, the sunlight or darkness, and other settings can change the colors or pixel values of the image. To avoid that and to obtain a comparable value, we normalize the cross correlation.
So this method will work even when you are taking pictures at different times during the day or with different exposure settings. Essentially, what happens is this: if your image has a negative value and you multiply it with the corresponding pixel of the other image, and these values have different signs, then the cross correlation is negative, that is, it has a lower value, which we do not want. Ideally, if the locations are the same, if the pixels are the same and just shifted versions of each other, you want the score to be maximal. So what normalizing essentially does is suppress the contribution of all these negative values. If the pixel values correspond, the score is maximized, and therefore, by maximizing the normalized cross correlation, we can use it as an error metric as well.
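To make the normalization concrete, here is a minimal sketch, assuming the common mean-subtracted, variance-normalized definition of NCC (my own illustration, not necessarily the lecture's exact formula); the patch and the brightness change are made-up example values.

```python
import numpy as np

def ncc(patch0, patch1, eps=1e-12):
    """Normalized cross-correlation: subtract each patch's mean and divide by
    the product of the patch norms, so the score always lies in [-1, 1]."""
    a = patch0.astype(np.float64) - patch0.mean()
    b = patch1.astype(np.float64) - patch1.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b)) + eps
    return float(np.sum(a * b) / denom)

# Example values (made up): a darker, brightness-shifted copy of the same
# patch, mimicking a different exposure of the same scene.
rng = np.random.default_rng(0)
p = rng.random((32, 32))
print(ncc(p, p), ncc(p, 0.5 * p + 0.2))
```

Both scores come out close to 1 even though the second patch is darker and brightness-shifted, which is the invariance to exposure changes described above.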
Okay, how do we estimate translational movement? In a translational movement we assume that the camera is shifted in only one direction, horizontally or vertically, and we take images before and after this shift of the camera. In order to find this movement, we first have to decide upon an error metric E(u), and then we have to search for the displacement u that minimizes this error metric. The most naive way of doing this search for u is a full search. For example, if you have selected a region in both of your images where you want to find this movement, you start with a displacement of one pixel and go up to the maximum displacement that can occur, you evaluate the error metric for each and every candidate displacement in that window, and then you select the value of u which gives you the lowest error value. However, this is a very naive approach: it does not capture sub-pixel movement (we will talk about what sub-pixel movement means later), and it is also very computationally expensive. As you can imagine, for a region of, for example, 100 x 100 pixels in each image there are almost 200 x 200 candidate displacements to search, and therefore we look for other methods or ways of solving this. A small sketch of such a brute-force search is shown below.
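Here is a minimal sketch of such a full search (my own illustration, not the lecture's code): it tries every integer displacement within a fixed radius and keeps the one with the lowest error, using a sum-of-squared-differences error simply as one possible choice of E(u); the image and the shift are made-up example values.

```python
import numpy as np

def full_search(img0, img1, max_shift=8):
    """Brute-force search over integer displacements u = (ux, uy), with the
    convention img1(x) ~ img0(x - u); returns the u with the lowest SSD."""
    best_u, best_err = (0, 0), np.inf
    h, w = img0.shape
    for uy in range(-max_shift, max_shift + 1):
        for ux in range(-max_shift, max_shift + 1):
            # Overlapping windows of img0 and the displaced img1.
            a = img0[max(0, -uy):h + min(0, -uy), max(0, -ux):w + min(0, -ux)]
            b = img1[max(0, uy):h + min(0, uy), max(0, ux):w + min(0, ux)]
            err = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
            if err < best_err:
                best_u, best_err = (ux, uy), err
    return best_u

# Example values (made up): content moves down 3 rows and left 2 columns.
rng = np.random.default_rng(1)
img0 = rng.random((64, 64))
img1 = np.roll(img0, shift=(3, -2), axis=(0, 1))
print(full_search(img0, img1))   # should print (-2, 3)
```

Even this small example already evaluates (2*8 + 1)^2 = 289 candidate displacements, which is exactly the cost that makes the full search expensive for larger search ranges.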
One of these ways is the use of an image pyramid. We start at the coarsest resolution, do a search in a small neighborhood, find the optimal value of u, and then scale the image up and refine our estimate of u. The refinement steps are much less computationally expensive, and therefore this is a very good method. So essentially what we do here is the following: as you can see, this is the highest-resolution image here; it is actually a combination of two images, where we can easily see the motion, the image moving left and right. We reduce the resolution by half, again by half, and again by half. At the coarsest resolution it is easy to estimate the motion because there are fewer pixels, so we first estimate the value of u there and then move up in the image pyramid. Visually, what we do is this: we have both these images, the left and the right one, image 1 and image 2, as we see here. We reduce their resolution, and then we find the optical flow, or the flow vector u, at the coarsest resolution. A coarse-to-fine sketch of this procedure is given below.
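Here is a minimal coarse-to-fine sketch (my own illustration, not the lecture's implementation): it builds the pyramid by repeated 2x2 averaging, runs a small full search only at the coarsest level, and then doubles and refines the estimate at every finer level; the test image, the number of levels, and the shift are made-up example values.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                   img[0::2, 1::2] + img[1::2, 1::2])

def ssd(img0, img1, ux, uy):
    """SSD between img0 and img1 displaced by u, over their overlapping region."""
    h, w = img0.shape
    a = img0[max(0, -uy):h + min(0, -uy), max(0, -ux):w + min(0, -ux)]
    b = img1[max(0, uy):h + min(0, uy), max(0, ux):w + min(0, ux)]
    return float(np.mean((a - b) ** 2))

def search(img0, img1, center, radius):
    """Best integer displacement within `radius` of `center`."""
    cx, cy = center
    candidates = [(cx + dx, cy + dy)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return min(candidates, key=lambda u: ssd(img0, img1, u[0], u[1]))

def coarse_to_fine(img0, img1, levels=3):
    # Build the pyramids: index 0 = full resolution, last entry = coarsest.
    pyr0, pyr1 = [img0], [img1]
    for _ in range(levels):
        pyr0.append(downsample(pyr0[-1]))
        pyr1.append(downsample(pyr1[-1]))
    # Full search only at the coarsest level, where it is cheap.
    u = search(pyr0[-1], pyr1[-1], (0, 0), radius=4)
    # Move up the pyramid: double the estimate, then refine by +/- 1 pixel.
    for lvl in range(levels - 1, -1, -1):
        u = (2 * u[0], 2 * u[1])
        u = search(pyr0[lvl], pyr1[lvl], u, radius=1)
    return u

# Example values (made up): a smooth blob image shifted down 6 and right 10.
y, x = np.mgrid[0:128, 0:128].astype(np.float64)
img0 = np.exp(-((x - 50.0) ** 2 + (y - 70.0) ** 2) / (2 * 20.0 ** 2))
img1 = np.roll(img0, shift=(6, 10), axis=(0, 1))
print(coarse_to_fine(img0, img1))   # should print approximately (10, 6)
```

The expensive full search only ever runs on the 16 x 16 coarsest level, and each refinement step checks just (2*1 + 1)^2 = 9 candidates, which is where the computational saving over a full-resolution search comes from.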